Ying Shan

Semantic Generative Tuning for Unified Multimodal Models

May 18, 2026
Via arXiv

Pixal3D: Pixel-Aligned 3D Generation from Images

May 11, 2026
Via arXiv

Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers

Apr 23, 2026
Via arXiv

OmniScript: Towards Audio-Visual Script Generation for Long-Form Cinematic Video

Apr 13, 2026
Via arXiv

CutClaw: Agentic Hours-Long Video Editing via Music Synchronization

Mar 31, 2026
Via arXiv

Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels

Mar 05, 2026
Via arXiv

CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video

Mar 04, 2026
Via arXiv

MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE

Feb 09, 2026
Via arXiv

VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control

Jan 08, 2026
Via arXiv

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Dec 23, 2025
Via arXiv